Overview

Dataset statistics

Number of variables: 13
Number of observations: 2969
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 301.7 KiB
Average record size in memory: 104.0 B

Variable types

Numeric: 12
Unsupported: 1

Alerts

gross_revenue is highly correlated with invoice_no and 5 other fields [High correlation]
recency_days is highly correlated with invoice_no [High correlation]
invoice_no is highly correlated with gross_revenue and 1 other fields [High correlation]
avg_quantity is highly correlated with gross_revenue and 4 other fields [High correlation]
total_quantity is highly correlated with gross_revenue and 5 other fields [High correlation]
avg_ticket is highly correlated with gross_revenue and 4 other fields [High correlation]
avg_basket_size is highly correlated with gross_revenue and 4 other fields [High correlation]
avg_un_basket_size is highly correlated with avg_ticket [High correlation]
qty_returns is highly correlated with gross_revenue and 4 other fields [High correlation]
avg_quantity is highly skewed (γ1 = 51.34460134) [Skewed]
avg_ticket is highly skewed (γ1 = 53.44422362) [Skewed]
qty_returns is highly skewed (γ1 = 51.79774426) [Skewed]
avg_basket_size is highly skewed (γ1 = 44.67271661) [Skewed]
df_index has unique values [Unique]
customer_id has unique values [Unique]
avg_recency_days is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
recency_days has 34 (1.1%) zeros [Zeros]
qty_returns has 1481 (49.9%) zeros [Zeros]
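
The γ1 values in the skewness alerts are sample skewness coefficients. The sketch below shows one common estimator, the adjusted Fisher-Pearson coefficient. It assumes the report uses a bias-corrected estimator of this kind, which the report itself does not state; the function name `sample_skewness` and the toy data are illustrative only.

```python
import math

def sample_skewness(values):
    """Adjusted Fisher-Pearson skewness (a bias-corrected sample estimator)."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least 3 observations")
    mean = sum(values) / n
    # Second and third central moments (population form, divided by n).
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    if m2 == 0:
        return 0.0  # constant data: treated here as having no skew
    g1 = m3 / m2 ** 1.5
    # Small-sample correction factor.
    return g1 * math.sqrt(n * (n - 1)) / (n - 2)

symmetric = [1, 2, 3, 4, 5]        # hypothetical data, no skew
right_tailed = [1, 1, 1, 2, 100]   # hypothetical data, one large outlier
print(sample_skewness(symmetric))
print(sample_skewness(right_tailed))
```

A single extreme value is enough to push the coefficient far from zero, which is consistent with the very large maxima reported for the skewed variables.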

Reproduction

Analysis started: 2022-10-08 16:07:19.863685
Analysis finished: 2022-10-08 16:07:51.557966
Duration: 31.69 seconds
Software version: pandas-profiling v3.3.0
Download configuration: config.json

Variables

df_index
Real number (ℝ≥0)

UNIQUE

Distinct: 2969
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2317.292354
Minimum: 0
Maximum: 5715
Zeros: 1
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 23.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 185.4
Q1: 929
median: 2120
Q3: 3537
95-th percentile: 5035.2
Maximum: 5715
Range: 5715
Interquartile range (IQR): 2608
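
Two rows in this block are derived from the others: Range is Maximum minus Minimum, and IQR is Q3 minus Q1. The following is a minimal sketch of such a summary using only the standard library. It assumes quartiles are obtained by linear interpolation between order statistics (`method="inclusive"`); the helper name `quantile_summary` and the toy data are hypothetical.

```python
import statistics

def quantile_summary(values):
    """Return the quartile-based rows of a quantile statistics block."""
    data = sorted(values)
    # Linear interpolation between order statistics (assumed convention).
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return {
        "minimum": data[0],
        "Q1": q1,
        "median": median,
        "Q3": q3,
        "maximum": data[-1],
        "range": data[-1] - data[0],
        "IQR": q3 - q1,
    }

summary = quantile_summary([0, 1, 2, 3, 4, 5, 6, 7, 8])  # toy data
print(summary)

# Consistency check against the df_index figures reported above.
assert 3537 - 929 == 2608   # Q3 - Q1 equals the reported IQR
assert 5715 - 0 == 5715     # Maximum - Minimum equals the reported Range
```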

Descriptive statistics

Standard deviation: 1554.944589
Coefficient of variation (CV): 0.6710178739
Kurtosis: -1.010787014
Mean: 2317.292354
Median Absolute Deviation (MAD): 1271
Skewness: 0.342284058
Sum: 6880041
Variance: 2417852.674
Monotonicity: Strictly increasing
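
Several of these figures are tied together by simple identities: the coefficient of variation is the standard deviation divided by the mean, the variance is the square of the standard deviation, and the sum is the mean times the number of observations. The short check below uses the df_index values reported above; the variable names are illustrative and the tolerances allow for rounding in the printed figures.

```python
import math

# Values copied from the df_index descriptive statistics.
std_dev = 1554.944589
mean = 2317.292354
variance = 2417852.674
total = 6880041
n_observations = 2969
reported_cv = 0.6710178739

cv = std_dev / mean
print(round(cv, 6))

assert math.isclose(cv, reported_cv, rel_tol=1e-6)          # CV = std / mean
assert math.isclose(std_dev ** 2, variance, rel_tol=1e-6)   # variance = std^2
assert math.isclose(mean * n_observations, total, rel_tol=1e-6)  # sum = mean * n
```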
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 1 | < 0.1%
3011 | 1 | < 0.1%
2996 | 1 | < 0.1%
2999 | 1 | < 0.1%
3000 | 1 | < 0.1%
3001 | 1 | < 0.1%
3002 | 1 | < 0.1%
3005 | 1 | < 0.1%
3007 | 1 | < 0.1%
3008 | 1 | < 0.1%
Other values (2959) | 2959 | 99.7%

Value | Count | Frequency (%)
0 | 1 | < 0.1%
1 | 1 | < 0.1%
2 | 1 | < 0.1%
3 | 1 | < 0.1%
4 | 1 | < 0.1%
5 | 1 | < 0.1%
6 | 1 | < 0.1%
7 | 1 | < 0.1%
8 | 1 | < 0.1%
9 | 1 | < 0.1%

Value | Count | Frequency (%)
5715 | 1 | < 0.1%
5696 | 1 | < 0.1%
5686 | 1 | < 0.1%
5680 | 1 | < 0.1%
5659 | 1 | < 0.1%
5655 | 1 | < 0.1%
5649 | 1 | < 0.1%
5638 | 1 | < 0.1%
5637 | 1 | < 0.1%
5627 | 1 | < 0.1%

customer_id
Real number (ℝ≥0)

UNIQUE

Distinct: 2969
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15270.77299
Minimum: 12347
Maximum: 18287
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 23.3 KiB

Quantile statistics

Minimum: 12347
5-th percentile: 12619.4
Q1: 13799
median: 15221
Q3: 16768
95-th percentile: 17964.6
Maximum: 18287
Range: 5940
Interquartile range (IQR): 2969

Descriptive statistics

Standard deviation: 1718.990292
Coefficient of variation (CV): 0.1125673398
Kurtosis: -1.206094692
Mean: 15270.77299
Median Absolute Deviation (MAD): 1488
Skewness: 0.03160785866
Sum: 45338925
Variance: 2954927.624
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
17850 | 1 | < 0.1%
17588 | 1 | < 0.1%
14905 | 1 | < 0.1%
16103 | 1 | < 0.1%
14626 | 1 | < 0.1%
14868 | 1 | < 0.1%
18246 | 1 | < 0.1%
17115 | 1 | < 0.1%
16611 | 1 | < 0.1%
15912 | 1 | < 0.1%
Other values (2959) | 2959 | 99.7%

Value | Count | Frequency (%)
12347 | 1 | < 0.1%
12348 | 1 | < 0.1%
12352 | 1 | < 0.1%
12356 | 1 | < 0.1%
12358 | 1 | < 0.1%
12359 | 1 | < 0.1%
12360 | 1 | < 0.1%
12362 | 1 | < 0.1%
12364 | 1 | < 0.1%
12370 | 1 | < 0.1%

Value | Count | Frequency (%)
18287 | 1 | < 0.1%
18283 | 1 | < 0.1%
18282 | 1 | < 0.1%
18277 | 1 | < 0.1%
18276 | 1 | < 0.1%
18274 | 1 | < 0.1%
18273 | 1 | < 0.1%
18272 | 1 | < 0.1%
18270 | 1 | < 0.1%
18269 | 1 | < 0.1%

gross_revenue
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 2954
Distinct (%): 99.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2749.321711
Minimum: 6.2
Maximum: 279138.02
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 23.3 KiB

Quantile statistics

Minimum: 6.2
5-th percentile: 229.77
Q1: 570.96
median: 1086.92
Q3: 2308.06
95-th percentile: 7219.68
Maximum: 279138.02
Range: 279131.82
Interquartile range (IQR): 1737.1

Descriptive statistics

Standard deviation: 10580.62331
Coefficient of variation (CV): 3.848448607
Kurtosis: 353.944724
Mean: 2749.321711
Median Absolute Deviation (MAD): 672.16
Skewness: 16.77755612
Sum: 8162736.16
Variance: 111949589.6
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
178.96 | 2 | 0.1%
533.33 | 2 | 0.1%
889.93 | 2 | 0.1%
2053.02 | 2 | 0.1%
745.06 | 2 | 0.1%
379.65 | 2 | 0.1%
2092.32 | 2 | 0.1%
731.9 | 2 | 0.1%
1353.74 | 2 | 0.1%
331 | 2 | 0.1%
Other values (2944) | 2949 | 99.3%

Value | Count | Frequency (%)
6.2 | 1 | < 0.1%
13.3 | 1 | < 0.1%
15 | 1 | < 0.1%
36.56 | 1 | < 0.1%
45 | 1 | < 0.1%
52 | 1 | < 0.1%
52.2 | 1 | < 0.1%
52.2 | 1 | < 0.1%
62.43 | 1 | < 0.1%
68.84 | 1 | < 0.1%

Value | Count | Frequency (%)
279138.02 | 1 | < 0.1%
259657.3 | 1 | < 0.1%
194550.79 | 1 | < 0.1%
168472.5 | 1 | < 0.1%
140450.72 | 1 | < 0.1%
124564.53 | 1 | < 0.1%
117379.63 | 1 | < 0.1%
91062.38 | 1 | < 0.1%
72882.09 | 1 | < 0.1%
66653.56 | 1 | < 0.1%

recency_days
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 272
Distinct (%): 9.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 64.28763894
Minimum: 0
Maximum: 373
Zeros: 34
Zeros (%): 1.1%
Negative: 0
Negative (%): 0.0%
Memory size: 23.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 2
Q1: 11
median: 31
Q3: 81
95-th percentile: 242
Maximum: 373
Range: 373
Interquartile range (IQR): 70

Descriptive statistics

Standard deviation: 77.75677911
Coefficient of variation (CV): 1.209513686
Kurtosis: 2.777962659
Mean: 64.28763894
Median Absolute Deviation (MAD): 26
Skewness: 1.798379538
Sum: 190870
Variance: 6046.116697
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
1 | 99 | 3.3%
4 | 87 | 2.9%
2 | 85 | 2.9%
3 | 85 | 2.9%
8 | 76 | 2.6%
10 | 67 | 2.3%
9 | 66 | 2.2%
7 | 66 | 2.2%
17 | 64 | 2.2%
16 | 55 | 1.9%
Other values (262) | 2219 | 74.7%

Value | Count | Frequency (%)
0 | 34 | 1.1%
1 | 99 | 3.3%
2 | 85 | 2.9%
3 | 85 | 2.9%
4 | 87 | 2.9%
5 | 43 | 1.4%
7 | 66 | 2.2%
8 | 76 | 2.6%
9 | 66 | 2.2%
10 | 67 | 2.3%

Value | Count | Frequency (%)
373 | 2 | 0.1%
372 | 4 | 0.1%
371 | 1 | < 0.1%
368 | 1 | < 0.1%
366 | 4 | 0.1%
365 | 2 | 0.1%
364 | 1 | < 0.1%
360 | 1 | < 0.1%
359 | 1 | < 0.1%
358 | 4 | 0.1%

invoice_no
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 56
Distinct (%): 1.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.723139104
Minimum: 1
Maximum: 206
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 23.3 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
median: 4
Q3: 6
95-th percentile: 17
Maximum: 206
Range: 205
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 8.85653132
Coefficient of variation (CV): 1.547495379
Kurtosis: 190.8344494
Mean: 5.723139104
Median Absolute Deviation (MAD): 2
Skewness: 10.76680458
Sum: 16992
Variance: 78.43814702
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
2 | 785 | 26.4%
3 | 499 | 16.8%
4 | 393 | 13.2%
5 | 237 | 8.0%
1 | 190 | 6.4%
6 | 173 | 5.8%
7 | 138 | 4.6%
8 | 98 | 3.3%
9 | 69 | 2.3%
10 | 55 | 1.9%
Other values (46) | 332 | 11.2%

Value | Count | Frequency (%)
1 | 190 | 6.4%
2 | 785 | 26.4%
3 | 499 | 16.8%
4 | 393 | 13.2%
5 | 237 | 8.0%
6 | 173 | 5.8%
7 | 138 | 4.6%
8 | 98 | 3.3%
9 | 69 | 2.3%
10 | 55 | 1.9%

Value | Count | Frequency (%)
206 | 1 | < 0.1%
199 | 1 | < 0.1%
124 | 1 | < 0.1%
97 | 1 | < 0.1%
91 | 2 | 0.1%
86 | 1 | < 0.1%
72 | 1 | < 0.1%
62 | 2 | 0.1%
60 | 1 | < 0.1%
57 | 1 | < 0.1%

avg_quantity
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED

Distinct: 2732
Distinct (%): 92.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 30.25312811
Minimum: 1
Maximum: 26999
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 23.3 KiB

Quantile statistics

Minimum: 1
5-th percentile: 2.6
Q1: 6.515151515
median: 10.32061069
Q3: 14.96875
95-th percentile: 48.67160105
Maximum: 26999
Range: 26998
Interquartile range (IQR): 8.453598485

Descriptive statistics

Standard deviation: 505.698578
Coefficient of variation (CV): 16.71557983
Kurtosis: 2729.149047
Mean: 30.25312811
Median Absolute Deviation (MAD): 4.150632762
Skewness: 51.34460134
Sum: 89821.53735
Variance: 255731.0518
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
10 | 10 | 0.3%
9.333333333 | 8 | 0.3%
18 | 7 | 0.2%
11 | 6 | 0.2%
12 | 6 | 0.2%
9 | 5 | 0.2%
10.85714286 | 5 | 0.2%
8 | 5 | 0.2%
12.5 | 5 | 0.2%
6.5 | 5 | 0.2%
Other values (2722) | 2907 | 97.9%

Value | Count | Frequency (%)
1 | 3 | 0.1%
1.052631579 | 1 | < 0.1%
1.055555556 | 1 | < 0.1%
1.131578947 | 1 | < 0.1%
1.21875 | 1 | < 0.1%
1.223529412 | 1 | < 0.1%
1.257142857 | 1 | < 0.1%
1.260416667 | 1 | < 0.1%
1.28 | 1 | < 0.1%
1.376146789 | 1 | < 0.1%

Value | Count | Frequency (%)
26999 | 1 | < 0.1%
3906 | 1 | < 0.1%
2000 | 1 | < 0.1%
1802.8 | 1 | < 0.1%
1756.5 | 1 | < 0.1%
1009.5 | 1 | < 0.1%
740 | 1 | < 0.1%
715.2 | 1 | < 0.1%
664.6153846 | 1 | < 0.1%
657 | 1 | < 0.1%

total_quantity
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 1671
Distinct (%): 56.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1608.852476
Minimum: 1
Maximum: 196844
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 23.3 KiB

Quantile statistics

Minimum: 1
5-th percentile: 102.4
Q1: 296
median: 641
Q3: 1401
95-th percentile: 4407.4
Maximum: 196844
Range: 196843
Interquartile range (IQR): 1105

Descriptive statistics

Standard deviation: 5887.578045
Coefficient of variation (CV): 3.659489067
Kurtosis: 465.998084
Mean: 1608.852476
Median Absolute Deviation (MAD): 422
Skewness: 17.85859125
Sum: 4776683
Variance: 34663575.24
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
310 | 11 | 0.4%
150 | 9 | 0.3%
88 | 9 | 0.3%
246 | 8 | 0.3%
272 | 8 | 0.3%
84 | 8 | 0.3%
260 | 8 | 0.3%
288 | 8 | 0.3%
1200 | 7 | 0.2%
516 | 7 | 0.2%
Other values (1661) | 2886 | 97.2%

Value | Count | Frequency (%)
1 | 1 | < 0.1%
2 | 2 | 0.1%
12 | 2 | 0.1%
16 | 1 | < 0.1%
17 | 1 | < 0.1%
18 | 1 | < 0.1%
19 | 1 | < 0.1%
20 | 1 | < 0.1%
23 | 1 | < 0.1%
25 | 1 | < 0.1%

Value | Count | Frequency (%)
196844 | 1 | < 0.1%
80997 | 1 | < 0.1%
80263 | 1 | < 0.1%
77373 | 1 | < 0.1%
69993 | 1 | < 0.1%
64549 | 1 | < 0.1%
64124 | 1 | < 0.1%
63312 | 1 | < 0.1%
58343 | 1 | < 0.1%
57885 | 1 | < 0.1%

avg_ticket
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED

Distinct: 2966
Distinct (%): 99.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 51.89776151
Minimum: 2.150588235
Maximum: 56157.5
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 23.3 KiB

Quantile statistics

Minimum: 2.150588235
5-th percentile: 4.916661099
Q1: 13.11933333
median: 17.95658654
Q3: 24.98828571
95-th percentile: 90.497
Maximum: 56157.5
Range: 56155.34941
Interquartile range (IQR): 11.86895238

Descriptive statistics

Standard deviation: 1036.934407
Coefficient of variation (CV): 19.98033011
Kurtosis: 2890.707126
Mean: 51.89776151
Median Absolute Deviation (MAD): 5.984842033
Skewness: 53.44422362
Sum: 154084.4539
Variance: 1075232.964
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
15 | 2 | 0.1%
4.162 | 2 | 0.1%
14.47833333 | 2 | 0.1%
18.15222222 | 1 | < 0.1%
13.92736842 | 1 | < 0.1%
36.24411765 | 1 | < 0.1%
29.78416667 | 1 | < 0.1%
22.8792623 | 1 | < 0.1%
20.51104167 | 1 | < 0.1%
149.025 | 1 | < 0.1%
Other values (2956) | 2956 | 99.6%

Value | Count | Frequency (%)
2.150588235 | 1 | < 0.1%
2.4325 | 1 | < 0.1%
2.462371134 | 1 | < 0.1%
2.511241379 | 1 | < 0.1%
2.515333333 | 1 | < 0.1%
2.65 | 1 | < 0.1%
2.656931818 | 1 | < 0.1%
2.707598253 | 1 | < 0.1%
2.760621572 | 1 | < 0.1%
2.770464191 | 1 | < 0.1%

Value | Count | Frequency (%)
56157.5 | 1 | < 0.1%
4453.43 | 1 | < 0.1%
3202.92 | 1 | < 0.1%
1687.2 | 1 | < 0.1%
952.9875 | 1 | < 0.1%
872.13 | 1 | < 0.1%
841.0214493 | 1 | < 0.1%
651.1683333 | 1 | < 0.1%
640 | 1 | < 0.1%
624.4 | 1 | < 0.1%

avg_recency_days
Unsupported

REJECTED
UNSUPPORTED

Missing: 0
Missing (%): 0.0%
Memory size: 23.3 KiB
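
The sample rows show avg_recency_days holding durations such as "35 days 12:00:00", which is why the profiler rejects it as unsupported. One possible cleanup, offered only as a sketch, is to convert each duration to fractional days so the column can be profiled as numeric. The helper `duration_to_days` is hypothetical, and the assumed text layout is "<D> days HH:MM:SS" as displayed in the sample.

```python
import re
from datetime import timedelta

# Assumed textual layout, taken from the sample rows: "<D> days HH:MM:SS".
_DURATION = re.compile(r"^(\d+) days (\d{2}):(\d{2}):(\d{2})$")

def duration_to_days(text):
    """Convert a duration string such as '35 days 12:00:00' to fractional days."""
    match = _DURATION.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised duration: {text!r}")
    days, hours, minutes, seconds = (int(group) for group in match.groups())
    delta = timedelta(days=days, hours=hours, minutes=minutes, seconds=seconds)
    return delta.total_seconds() / 86400  # 86400 seconds per day

print(duration_to_days("35 days 12:00:00"))
print(duration_to_days("8 days 14:24:00"))
```

If the column is already a pandas timedelta Series, dividing `Series.dt.total_seconds()` by 86400 should give the same result without any string parsing.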

frequency
Real number (ℝ≥0)

Distinct: 1350
Distinct (%): 45.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.06327807795
Minimum: 0.005449591281
Maximum: 3
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 23.3 KiB

Quantile statistics

Minimum: 0.005449591281
5-th percentile: 0.009433962264
Q1: 0.01777777778
median: 0.02941176471
Q3: 0.05540166205
95-th percentile: 0.2222222222
Maximum: 3
Range: 2.994550409
Interquartile range (IQR): 0.03762388427

Descriptive statistics

Standard deviation: 0.1344820641
Coefficient of variation (CV): 2.125255198
Kurtosis: 121.5575473
Mean: 0.06327807795
Median Absolute Deviation (MAD): 0.01433823529
Skewness: 8.773259386
Sum: 187.8726134
Variance: 0.01808542557
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0.1666666667 | 21 | 0.7%
0.3333333333 | 21 | 0.7%
0.02777777778 | 20 | 0.7%
0.09090909091 | 19 | 0.6%
0.0625 | 17 | 0.6%
0.1333333333 | 16 | 0.5%
0.4 | 16 | 0.5%
0.25 | 15 | 0.5%
0.02380952381 | 15 | 0.5%
0.03571428571 | 15 | 0.5%
Other values (1340) | 2794 | 94.1%

Value | Count | Frequency (%)
0.005449591281 | 1 | < 0.1%
0.005464480874 | 1 | < 0.1%
0.005494505495 | 1 | < 0.1%
0.005509641873 | 1 | < 0.1%
0.005586592179 | 2 | 0.1%
0.005602240896 | 1 | < 0.1%
0.005617977528 | 2 | 0.1%
0.00566572238 | 1 | < 0.1%
0.005681818182 | 2 | 0.1%
0.005698005698 | 3 | 0.1%

Value | Count | Frequency (%)
3 | 1 | < 0.1%
2 | 1 | < 0.1%
1.571428571 | 1 | < 0.1%
1.5 | 3 | 0.1%
1 | 14 | 0.5%
0.8333333333 | 1 | < 0.1%
0.75 | 1 | < 0.1%
0.6666666667 | 12 | 0.4%
0.6514745308 | 1 | < 0.1%
0.6 | 1 | < 0.1%

qty_returns
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED
ZEROS

Distinct: 214
Distinct (%): 7.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 62.1569552
Minimum: 0
Maximum: 80995
Zeros: 1481
Zeros (%): 49.9%
Negative: 0
Negative (%): 0.0%
Memory size: 23.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 1
Q3: 9
95-th percentile: 100.6
Maximum: 80995
Range: 80995
Interquartile range (IQR): 9

Descriptive statistics

Standard deviation: 1512.496135
Coefficient of variation (CV): 24.33349783
Kurtosis: 2765.52864
Mean: 62.1569552
Median Absolute Deviation (MAD): 1
Skewness: 51.79774426
Sum: 184544
Variance: 2287644.557
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 1481 | 49.9%
1 | 164 | 5.5%
2 | 148 | 5.0%
3 | 105 | 3.5%
4 | 89 | 3.0%
6 | 78 | 2.6%
5 | 61 | 2.1%
12 | 51 | 1.7%
8 | 43 | 1.4%
7 | 43 | 1.4%
Other values (204) | 706 | 23.8%

Value | Count | Frequency (%)
0 | 1481 | 49.9%
1 | 164 | 5.5%
2 | 148 | 5.0%
3 | 105 | 3.5%
4 | 89 | 3.0%
5 | 61 | 2.1%
6 | 78 | 2.6%
7 | 43 | 1.4%
8 | 43 | 1.4%
9 | 41 | 1.4%

Value | Count | Frequency (%)
80995 | 1 | < 0.1%
9014 | 1 | < 0.1%
8004 | 1 | < 0.1%
4427 | 1 | < 0.1%
3768 | 1 | < 0.1%
3332 | 1 | < 0.1%
2878 | 1 | < 0.1%
2022 | 1 | < 0.1%
2012 | 1 | < 0.1%
1776 | 1 | < 0.1%

avg_basket_size
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED

Distinct: 1979
Distinct (%): 66.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 249.8137641
Minimum: 1
Maximum: 40498.5
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 23.3 KiB

Quantile statistics

Minimum: 1
5-th percentile: 44
Q1: 103.25
median: 172.3333333
Q3: 281.6923077
95-th percentile: 600
Maximum: 40498.5
Range: 40497.5
Interquartile range (IQR): 178.4423077

Descriptive statistics

Standard deviation: 791.5551894
Coefficient of variation (CV): 3.168581172
Kurtosis: 2255.538236
Mean: 249.8137641
Median Absolute Deviation (MAD): 83.08333333
Skewness: 44.67271661
Sum: 741697.0657
Variance: 626559.6179
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
100 | 11 | 0.4%
114 | 10 | 0.3%
73 | 9 | 0.3%
82 | 9 | 0.3%
86 | 9 | 0.3%
60 | 8 | 0.3%
88 | 8 | 0.3%
75 | 8 | 0.3%
136 | 8 | 0.3%
208 | 7 | 0.2%
Other values (1969) | 2882 | 97.1%

Value | Count | Frequency (%)
1 | 2 | 0.1%
2 | 1 | < 0.1%
3.333333333 | 1 | < 0.1%
5.333333333 | 1 | < 0.1%
5.666666667 | 1 | < 0.1%
6.142857143 | 1 | < 0.1%
7.5 | 1 | < 0.1%
9 | 1 | < 0.1%
9.5 | 1 | < 0.1%
11 | 1 | < 0.1%

Value | Count | Frequency (%)
40498.5 | 1 | < 0.1%
6009.333333 | 1 | < 0.1%
4282 | 1 | < 0.1%
3906 | 1 | < 0.1%
3868.65 | 1 | < 0.1%
2880 | 1 | < 0.1%
2801 | 1 | < 0.1%
2733.944444 | 1 | < 0.1%
2518.769231 | 1 | < 0.1%
2160.333333 | 1 | < 0.1%

avg_un_basket_size
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 1005
Distinct (%): 33.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 22.1547082
Minimum: 1
Maximum: 299.7058824
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 23.3 KiB

Quantile statistics

Minimum: 1
5-th percentile: 3.345454545
Q1: 10
median: 17.2
Q3: 27.75
95-th percentile: 56.94
Maximum: 299.7058824
Range: 298.7058824
Interquartile range (IQR): 17.75

Descriptive statistics

Standard deviation: 19.51232207
Coefficient of variation (CV): 0.8807302672
Kurtosis: 27.70329723
Mean: 22.1547082
Median Absolute Deviation (MAD): 8.2
Skewness: 3.499455899
Sum: 65777.32865
Variance: 380.7307127
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
13 | 53 | 1.8%
14 | 39 | 1.3%
11 | 38 | 1.3%
20 | 33 | 1.1%
9 | 33 | 1.1%
1 | 32 | 1.1%
17 | 31 | 1.0%
18 | 30 | 1.0%
10 | 30 | 1.0%
5 | 29 | 1.0%
Other values (995) | 2621 | 88.3%

Value | Count | Frequency (%)
1 | 32 | 1.1%
1.2 | 1 | < 0.1%
1.25 | 1 | < 0.1%
1.333333333 | 2 | 0.1%
1.5 | 8 | 0.3%
1.568181818 | 1 | < 0.1%
1.571428571 | 1 | < 0.1%
1.666666667 | 4 | 0.1%
1.833333333 | 1 | < 0.1%
2 | 24 | 0.8%

Value | Count | Frequency (%)
299.7058824 | 1 | < 0.1%
259 | 1 | < 0.1%
203.5 | 1 | < 0.1%
148 | 1 | < 0.1%
145 | 1 | < 0.1%
136.125 | 1 | < 0.1%
135.5 | 1 | < 0.1%
127 | 1 | < 0.1%
122 | 1 | < 0.1%
118 | 1 | < 0.1%

Interactions

[Figure: pairwise interaction plots between the numeric variables (144 panels)]

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
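
The calculation described above can be sketched in plain Python: rank both variables, then apply the covariance-over-standard-deviations formula to the ranks. The helper names (`average_ranks`, `rank_correlation`, `spearman_rho`) and the choice to give tied values their average rank are assumptions made for illustration, not something the report specifies.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def rank_correlation(x, y):
    """Covariance divided by the product of standard deviations (1/n cancels)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

def spearman_rho(x, y):
    return rank_correlation(average_ranks(x), average_ranks(y))

x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]   # nonlinear but strictly increasing
print(spearman_rho(x, y))
```

Because only the ranks enter the formula, a strictly increasing but nonlinear relationship such as the cubic above still gives a coefficient of 1 up to floating-point rounding.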

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for an exactly linear relationship the steepness of the line (its angle to the x-axis) does not change the magnitude of r; only the direction of the slope determines the sign.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
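
A minimal sketch of that formula, using population (divide-by-n) covariance and standard deviations; the function name `pearson_r` and the toy data are illustrative. The two example lines have different slopes and intercepts, which shows the location and scale invariance in practice.

```python
import math

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    # Note: a constant input has zero standard deviation and r is undefined.
    return cov / (std_x * std_y)

x = [1.0, 2.0, 3.0, 4.0]
steep_up = [2.0 * v + 5.0 for v in x]        # positive slope
shallow_down = [-0.5 * v + 1.0 for v in x]   # negative slope
print(pearson_r(x, steep_up))
print(pearson_r(x, shallow_down))
```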

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
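
The pair-counting definition translates directly into code. The sketch below implements exactly that formula (often called tau-a), in which tied pairs count as neither concordant nor discordant. Library implementations commonly apply a tie correction (tau-b), so the report's values may differ where ties occur. The function name `kendall_tau_a` is illustrative.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            concordant += 1      # both variables move in the same direction
        elif product < 0:
            discordant += 1      # the variables move in opposite directions
        # product == 0 means a tie in x or y: counted in neither group
    total_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / total_pairs

# Six pairs in total: five concordant, one discordant (the swapped 3 and 2).
print(kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4]))
```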

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion by eye.

Sample

First rows

(index) | df_index | customer_id | gross_revenue | recency_days | invoice_no | avg_quantity | total_quantity | avg_ticket | avg_recency_days | frequency | qty_returns | avg_basket_size | avg_un_basket_size
0 | 0 | 17850 | 5391.21 | 372.0 | 34.0 | 5.835017 | 1733.0 | 18.152222 | 35 days 12:00:00 | 0.486111 | 40.0 | 50.970588 | 8.735294
1 | 1 | 13047 | 3232.59 | 56.0 | 9.0 | 8.128655 | 1390.0 | 18.904035 | 27 days 06:00:00 | 0.048780 | 35.0 | 154.444444 | 19.000000
2 | 2 | 12583 | 6705.38 | 2.0 | 15.0 | 21.672414 | 5028.0 | 28.902500 | 23 days 04:30:00 | 0.045699 | 50.0 | 335.200000 | 15.466667
3 | 3 | 13748 | 948.25 | 95.0 | 5.0 | 15.678571 | 439.0 | 33.866071 | 92 days 16:00:00 | 0.017921 | 0.0 | 87.800000 | 5.600000
4 | 4 | 15100 | 876.00 | 333.0 | 3.0 | 26.666667 | 80.0 | 292.000000 | 8 days 14:24:00 | 0.136364 | 22.0 | 26.666667 | 1.000000
5 | 5 | 15291 | 4623.30 | 25.0 | 14.0 | 20.607843 | 2102.0 | 45.326471 | 23 days 04:48:00 | 0.054441 | 29.0 | 150.142857 | 7.285714
6 | 6 | 14688 | 5630.87 | 7.0 | 21.0 | 11.073394 | 3621.0 | 17.219786 | 18 days 07:12:00 | 0.073569 | 399.0 | 172.428571 | 15.571429
7 | 7 | 17809 | 5411.91 | 16.0 | 12.0 | 33.721311 | 2057.0 | 88.719836 | 35 days 16:48:00 | 0.039106 | 41.0 | 171.416667 | 5.083333
8 | 8 | 15311 | 60767.90 | 0.0 | 91.0 | 16.054645 | 38194.0 | 25.543464 | 4 days 03:28:00 | 0.315508 | 474.0 | 419.714286 | 26.142857
9 | 9 | 16098 | 2005.63 | 87.0 | 7.0 | 9.149254 | 613.0 | 29.934776 | 47 days 16:00:00 | 0.024390 | 0.0 | 87.571429 | 9.571429

Last rows

(index) | df_index | customer_id | gross_revenue | recency_days | invoice_no | avg_quantity | total_quantity | avg_ticket | avg_recency_days | frequency | qty_returns | avg_basket_size | avg_un_basket_size
2959 | 5627 | 17727 | 1060.25 | 15.0 | 1.0 | 9.772727 | 645.0 | 16.064394 | 6 days 00:00:00 | 0.285714 | 6.0 | 645.000000 | 66.0
2960 | 5637 | 17232 | 421.52 | 2.0 | 2.0 | 5.638889 | 203.0 | 11.708889 | 12 days 00:00:00 | 0.153846 | 0.0 | 101.500000 | 18.0
2961 | 5638 | 17468 | 137.00 | 10.0 | 2.0 | 23.200000 | 116.0 | 27.400000 | 4 days 00:00:00 | 0.400000 | 0.0 | 58.000000 | 2.5
2962 | 5649 | 13596 | 697.04 | 5.0 | 2.0 | 2.445783 | 406.0 | 4.199036 | 7 days 00:00:00 | 0.250000 | 0.0 | 203.000000 | 83.0
2963 | 5655 | 14893 | 1237.85 | 9.0 | 2.0 | 10.945205 | 799.0 | 16.956849 | 2 days 00:00:00 | 0.666667 | 0.0 | 399.500000 | 36.5
2964 | 5659 | 12479 | 473.20 | 11.0 | 1.0 | 12.733333 | 382.0 | 15.773333 | 4 days 00:00:00 | 0.333333 | 34.0 | 382.000000 | 30.0
2965 | 5680 | 14126 | 706.13 | 7.0 | 3.0 | 33.866667 | 508.0 | 47.075333 | 3 days 00:00:00 | 1.000000 | 50.0 | 169.333333 | 5.0
2966 | 5686 | 13521 | 1092.39 | 1.0 | 3.0 | 1.685057 | 733.0 | 2.511241 | 4 days 12:00:00 | 0.300000 | 0.0 | 244.333333 | 145.0
2967 | 5696 | 15060 | 301.84 | 8.0 | 4.0 | 2.183333 | 262.0 | 2.515333 | 1 days 00:00:00 | 2.000000 | 0.0 | 65.500000 | 30.0
2968 | 5715 | 12558 | 269.96 | 7.0 | 1.0 | 17.818182 | 196.0 | 24.541818 | 6 days 00:00:00 | 0.285714 | 196.0 | 196.000000 | 11.0